RANDOM MATRICES, NON-BACKTRACKING 
WALKS, AND ORTHOGONAL POLYNOMIALS 



SASHA SODIN 

Abstract. Several well-known results from random matrix theory,
such as Wigner's law and the Marchenko-Pastur law, can
be interpreted (and proved) in terms of non-backtracking walks
on a certain graph. Orthogonal polynomials with respect to the
limiting spectral measure play a role in this approach.



1. Introduction 

Our goal is to explain a unified approach to the proofs of several 
well-known theorems in the spectral theory of random matrices and 
random graphs. Some of these results are formulated further in the 
introduction; striving to make the main idea as clear as possible, we 
restrict ourselves to paradigmatic examples. In particular, we only con- 
sider Bernoulli random matrices, although most proofs can be adapted 
to arbitrary random variables under mild assumptions on tail decay. 

The method may be seen as a modification of the moment method; in
the latter, used extensively since Wigner, spectral properties of a
matrix $M$ are extracted from the traces $\operatorname{tr} M^k$ of powers of $M$.
Instead, we propose to estimate $\operatorname{tr} P_k(M)$, where $P_k$ are the
orthogonal polynomials with respect to a certain measure $\sigma$, which is
the candidate for the limiting spectral measure. Perhaps surprisingly,
these numbers have, in some cases, a simple combinatorial interpretation
in terms of non-backtracking walks (see Subsection 2.3) on an
appropriate graph.

One can also start from a linear recurrence relation of order two for
the number of non-backtracking walks. Then a measure $\sigma$ appears from
the correspondence between Jacobi (tridiagonal) matrices and measures on
$\mathbb{R}$. This classical correspondence involves the orthogonal polynomials
$P_k$ with respect to $\sigma$, which satisfy the same recurrence relation. In
fact, we will see (e.g. in Lemma 2.7) that the matrix $P_k(M)$ is closely
related to non-backtracking walks of length $k$.

Now it is natural to guess that $\sigma$ is the limiting spectral measure.
We show that this is the case if the traces $\operatorname{tr} P_k(M)$ do not grow too
fast; the proof is based on an analytic lemma (cf. Subsection 5.2). The
combinatorial estimates (in Section 6) allow us to bound these traces for
the examples that we consider.

Although orthogonal polynomials do not appear explicitly in the
work of Bai and Yin on the smallest singular value of a random
covariance matrix [5], the present note (as well as part of the previous
work [4]) started from an attempt to understand and generalise their proof.

Similar ideas emerged also in spectral graph theory, starting
from the work of McKay [19, 20]. McKay derived an expression for
the number of non-backtracking walks on a graph in terms of certain
polynomials of the adjacency matrix from a certain recurrence relation
and applied it to study the spectral measure of $d$-regular graphs;
Friedman [8] applied it to study the spectral gap of random graphs. Li and
Sole [17] noted that these are exactly the orthogonal polynomials with
respect to the Kesten-McKay measure (7), and suggested considering
more general measures of the Bernstein-Szego class (see Subsection 5.1).
They also used the Chebyshev-Markov-Stieltjes inequalities (cf.
Subsection 5.2). Related methods were developed by Brooks [7] and Serre.

We try to emphasise the applications to matrices other than the
adjacency matrix of a graph, and especially to random matrices.

Acknowledgement. I am grateful to my supervisor Vitali Milman 
for his support and useful discussions, and for urging me to write this 
note. The mini-courses on Random Matrix Theory taught by Leonid 
Pastur and Mariya Shcherbina (in Vienna and Paris) greatly improved 
my understanding of this field. My father helped me find the way 
in the literature on the problem of moments. Bo'az Klartag, Michel 
Ledoux, Brendan McKay, and Paul Nevai have kindly commented on 
a preliminary version of this note. I thank them all very much. 

1.1. Two definitions and notation. 

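Definition 1.1. Let $M$ be an $n \times n$ symmetric matrix; let

$$ \lambda_1(M) \le \lambda_2(M) \le \dots \le \lambda_n(M) $$

be the eigenvalues of $M$. The measure $\mu_M$,

$$ \mu_M(S) = \frac{1}{n}\, \#\{1 \le j \le n \mid \lambda_j(M) \in S\}, \quad S \subset \mathbb{R}, \tag{1} $$

is called the spectral measure of $M$.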

Definition 1.2. Let $\mu$, $\nu$ be two probability measures on $\mathbb{R}$. The
Kolmogorov distance between $\mu$ and $\nu$ is defined as

$$ d_K(\mu, \nu) = \sup_{x \in \mathbb{R}} \left| \mu(-\infty, x] - \nu(-\infty, x] \right| . $$
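As a small illustration of Definitions 1.1 and 1.2, the following Python sketch computes the Kolmogorov distance between the empirical measure of a finite list of points (for instance, the eigenvalues of a matrix) and a measure given by a continuous distribution function. The function names `wigner_cdf`, `kolmogorov_distance` and `quantile` are illustrative and not notation from the text; the closed form of the distribution function is obtained by integrating the semicircle density $\frac{2}{\pi}\sqrt{1-x^2}$ on $[-1,1]$.

```python
import math

def wigner_cdf(x):
    """Distribution function of the measure with density (2/pi) sqrt(1 - x^2) on [-1, 1]."""
    if x <= -1.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return 0.5 + (x * math.sqrt(1.0 - x * x) + math.asin(x)) / math.pi

def kolmogorov_distance(points, cdf):
    """sup_x |mu_n(-inf, x] - cdf(x)| for the empirical measure mu_n of `points`.

    Assumes `cdf` is continuous, so the supremum is approached at one of the atoms.
    """
    xs = sorted(points)
    n = len(xs)
    dist = 0.0
    for j, x in enumerate(xs):
        f = cdf(x)
        # the empirical distribution function jumps from j/n to (j+1)/n at x
        dist = max(dist, abs((j + 1) / n - f), abs(j / n - f))
    return dist

def quantile(cdf, q, lo=-1.0, hi=1.0):
    """Solve cdf(x) = q by bisection on [lo, hi]."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# n atoms placed at the quantiles (j - 1/2)/n of the semicircle measure
n = 50
atoms = [quantile(wigner_cdf, (j - 0.5) / n) for j in range(1, n + 1)]
d = kolmogorov_distance(atoms, wigner_cdf)
```

For atoms placed at these quantiles the distance equals $1/(2n)$, the smallest value any $n$-atom measure with equal weights can achieve against a continuous distribution.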
Notation: Unless otherwise specified, $C, C_1, C_2, C', c, c', \dots$ denote
positive constants not depending on any parameters of the problem.
Usually, upper case $C$ stands for a big constant, and lower case $c$ for
a small constant.



1.2. Symmetric random matrices. For $n \in \mathbb{N}$, let $A$ be a symmetric
$n \times n$ random matrix, such that

$$ A_{uv} \text{ are independent for } 1 \le u \le v \le n, \qquad
   P\{A_{uv} = -1/(2\sqrt{n})\} = P\{A_{uv} = 1/(2\sqrt{n})\} = 1/2 . \tag{2} $$



Theorem 1.3 (Wigner's law). As $n \to \infty$, the random measures $\mu_A$
converge (weakly, in distribution) to a deterministic measure $\sigma_W$
supported on $[-1, 1]$;

$$ d\sigma_W(x) = \frac{2}{\pi} \sqrt{1 - x^2}\, dx . $$

The measure $\sigma_W$ is called the Wigner measure.

Remark 1.4 (Precise meaning of convergence). The space $\mathcal{M}(\mathbb{R})$ of
measures on $\mathbb{R}$ is equipped with the weak topology. For every $n \in \mathbb{N}$,
the measure $\mu_A$ is a random element of $\mathcal{M}(\mathbb{R})$; its distribution is a
probability measure on $\mathcal{M}(\mathbb{R})$. In Wigner's law, these distributions
converge (weakly) to the distribution $\delta_{\sigma_W}$ supported on a single point
$\sigma_W \in \mathcal{M}(\mathbb{R})$.

Theorem 1.5 (Füredi-Komlós [10]). As $n \to \infty$, the operator norm

$$ \|A\| = \max(|\lambda_1|, |\lambda_n|) $$

of $A$ converges (in distribution) to 1.

Wigner's theorem (above) implies that

$$ P\{\|A\| \le 1 - \varepsilon\} \to 0 $$

for any $\varepsilon > 0$. As for the complementary inequality, we prove a stronger
fact:

Theorem 1.6 (A. Boutet de Monvel and M. Shcherbina [6]). For some
(universal) constants $c, \alpha_1, \alpha_2, \alpha_3 > 0$,

$$ P\{\|A\| \ge 1 + \varepsilon\} \le \exp(-c\, n^{\alpha_1} \varepsilon^{\alpha_2}) , \tag{3} $$

provided that 
1.3. Random covariance matrices. For $n \le N$, let $B$ be an $n \times N$
random matrix (that is, $B : \mathbb{R}^N \to \mathbb{R}^n$), so that

$$ B_{uv} \text{ are independent for } 1 \le u \le n,\ 1 \le v \le N, \qquad
   P\{B_{uv} = -1/\sqrt{N}\} = P\{B_{uv} = 1/\sqrt{N}\} = 1/2 . $$

Now we are interested in the eigenvalues

$$ 0 \le \lambda_1 \le \dots \le \lambda_n $$

of the (symmetric) matrix $C = BB^t$.

Theorem 1.7 (Marchenko-Pastur [18]). If $n, N \to \infty$ so that

$$ n/N \to \xi \in (0, 1] , $$

the spectral measure $\mu_C$ converges (weakly, in distribution) to a
deterministic measure $\sigma^\xi_{MP}$ supported on $[(1 - \sqrt{\xi})^2, (1 + \sqrt{\xi})^2]$;

$$ d\sigma^\xi_{MP}(x) = \frac{\sqrt{\left(x - (1 - \sqrt{\xi})^2\right)\left((1 + \sqrt{\xi})^2 - x\right)}}{2\pi \xi x}\, dx . $$

The measure $\sigma^\xi_{MP}$ is called the Marchenko-Pastur measure.

Theorem 1.8 (Geman [11], Bai-Yin [5]). If $n, N \to \infty$ so that

$$ n/N \to \xi \in (0, 1] , $$

the smallest eigenvalue of $C$ converges (in distribution) to $(1 - \sqrt{\xi})^2$,
and the largest to $(1 + \sqrt{\xi})^2$.

Remark 1.9. The convergence of the largest eigenvalue was proved by
Geman, and that of the smallest by Bai and Yin.

Similarly to the previous subsection,

$$ P\{\lambda_1(C) \ge (1 - \sqrt{\xi})^2 + \varepsilon\} \to 0 $$

and

$$ P\{\lambda_n(C) \le (1 + \sqrt{\xi})^2 - \varepsilon\} \to 0 $$

by the Marchenko-Pastur theorem. As for the complementary
inequalities, we prove the following:

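Theorem 1.10 ([4]). For some (universal) constants $c, \beta_1, \beta_2, \beta_3 > 0$,

$$ P\{\lambda_1(C) \le (1 - \sqrt{\xi})^2 - \varepsilon\} \le \exp(-c\, n^{\beta_1} \varepsilon^{\beta_2}) , \tag{5} $$

$$ P\{\lambda_n(C) \ge (1 + \sqrt{\xi})^2 + \varepsilon\} \le \exp(-c\, n^{\beta_1} \varepsilon^{\beta_2}) , \tag{6} $$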
provided that 
1.4. Adjacency matrix of a random $d$-regular graph. Fix $d \ge 3$;
let $G = (V, E)$ be a random $d$-regular graph on $n$ vertices. That is, $G$
is picked uniformly from the collection of all graphs $G = (V, E)$ such
that $\#V = n$ and

$$ \#\{u \in V \mid (u, v) \in E\} = d \quad \text{for every } v \in V . $$

Let $A(G)$ be the adjacency matrix of $G$; that is,

$$ A(G)_{uv} = \begin{cases} 1, & (u,v) \in E \\ 0, & \text{otherwise.} \end{cases} $$



Theorem 1.11 (McKay). The spectral measure $\mu_{A(G)}$ converges (weakly,
in distribution, as $n \to \infty$) to a deterministic measure $\sigma^d_{KM}$ supported
on $[-2\sqrt{d-1},\ 2\sqrt{d-1}]$;

$$ d\sigma^d_{KM}(x) = \frac{d}{2\pi}\, \frac{\sqrt{4(d-1) - x^2}}{d^2 - x^2}\, dx . \tag{7} $$

The measure $\sigma^d_{KM}$ is called the Kesten-McKay measure.



1.5. A guide to the next sections. In Subsection 2.1 we introduce
the general framework that unites all the problems listed above. In
Subsection 2.2 we focus on an example, the infinite $d$-regular tree,
that should clarify the meaning of the Kesten-McKay measure, and
also hint at the main idea in the proofs of all the theorems. Lemma 2.7 in
Subsection 2.3 relates the spectral properties of the matrices under study
to certain combinatorial quantities.

We apply it in Subsection 3.1 to prove McKay's theorem, and in
Subsection 3.2 to prove Wigner's theorem. In Subsection 3.3 we
sketch the proof of the Marchenko-Pastur theorem. The bounds on
extremal eigenvalues are the subject of Section 4.

Section 5 recalls some properties of orthogonal polynomials with
respect to measures that appear in this note. In Section 6 we prove the
combinatorial estimates used in the proofs of the theorems on random
matrices. These two sections contain the technical results that we use
elsewhere.



2. Spectral measure: limit theorems 



2.1. Matrices on graphs. Let $G = (V, E)$ be a graph (with vertices $V$
and edges $E$). A (symmetric) $V \times V$ matrix $M$ is called a (symmetric)
sign matrix on $G$ if

$$ M_{uv} = \begin{cases} \pm 1, & (u,v) \in E \\ 0, & (u,v) \notin E . \end{cases} $$

Example 2.1. If

$$ M_{uv} = \begin{cases} +1, & (u,v) \in E \\ 0, & (u,v) \notin E , \end{cases} $$

$M$ is the adjacency matrix $A(G)$ of $G$.



If the degree of every vertex is finite, that is,

$$ \deg(v) = \#\{u \in V \mid (u, v) \in E\} < +\infty $$

for every $v \in V$, the matrix $M$ defines a symmetric operator on a dense
subspace of $L_2(V)$. If moreover the degrees are uniformly bounded by
a number $D$, $M$ is self-adjoint and $\|M\| \le D$.

We are mainly interested in finite graphs (#V < +oo); however, it 
will be convenient to have the definitions in this generality. 

Let us recall the spectral theorem for self-adjoint operators (see 
Akhiezer and Glazman [2]). 

Definition 2.2. A family of projectors $\{E_t \mid -D \le t \le +D\}$ is called
a resolution of identity if

(1) $E_{-D} = 0$, $E_{+D} = 1$;

(2) $E_t E_{t'} = E_{\min(t,t')}$;

(3) $\lim_{t \to t' - 0} E_t = E_{t'}$.

For our operator $M$, there exists a resolution of identity such that
all $E_t$ commute with $M$ and

$$ p(M) = \int_{-D}^{D} p(t)\, dE_t \tag{8} $$

for any polynomial $p$.

The (operator-valued) measure $dE_t$ is called the spectral measure of
$M$. In some important cases the (real) measure $d(E_t \delta_v, \delta_v)$ does not
depend on the choice of a vertex $v \in V$ (here $\delta_v(u) = \delta_{uv}$ for $u, v \in V$).
In this case, we also call it the spectral measure of $M$ (more general
definitions are available for $M = A(G)$; see Grigorchuk and Zuk [13]
and references therein).
2.2. Main example. Denote by $H_d = (V_d, E_d)$ the (infinite) $d$-regular
tree ($d \ge 3$); let $M$ be a symmetric sign matrix on $H_d$. According to
(8),

$$ (p(M) f, f) = \int_{-d}^{d} p(t)\, d(E_t f, f) $$

for any polynomial $p$ and any $f \in L_2(V_d)$, and in particular

$$ (p(M)\delta_u, \delta_u) = \int_{-d}^{d} p(t)\, d(E_t \delta_u, \delta_u) . \tag{9} $$

Note that the measures $d(E_t \delta_u, \delta_u)$ do not depend on $u$ (because of
homogeneity). In fact, these measures also do not depend on $M$. The
following fact is essentially due to Kesten [16]:



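Proposition 2.3. The measures $d(E_t \delta_u, \delta_u)$ are equal to the
Kesten-McKay measure $\sigma^d_{KM}$.

Proof. Define a sequence of polynomials

$$ (p_k)_{k \in \mathbb{Z}_+} = (p_{k,d})_{k \in \mathbb{Z}_+}, \quad \deg p_k = k : $$

$$ \begin{aligned}
p_0(t) &= 1, \qquad p_1(t) = t/\sqrt{d}, \qquad p_2(t) = (t^2 - d)/\sqrt{d(d-1)}, \\
p_{k+1}(t) &= t\, p_k(t)/\sqrt{d-1} - p_{k-1}(t) \qquad (k = 2, 3, \dots) .
\end{aligned} \tag{10} $$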
Lemma 2.4.

$$ (p_k(M)\delta_u, \delta_u) = 0 \quad \text{for } k = 1, 2, 3, \dots . $$

As we shall see (in Lemma 2.7, from which our lemma follows), this
equality expresses the fact that "there are no cycles in $H_d$". Now we
need one more property of the polynomials $p_k$; for the proof, see Remark 5.4
in Section 5 (and the discussion preceding it).

Lemma 2.5. The polynomials $p_k$ are orthogonal with respect to the
measure $\sigma^d_{KM}$:

$$ \int_{-d}^{d} p_k(t)\, p_l(t)\, d\sigma^d_{KM}(t) = \delta_{kl} , \quad k, l \in \mathbb{Z}_+ . $$

In view of (9) and Lemma 2.4,

$$ \int_{-d}^{d} p_k(t)\, d(E_t \delta_u, \delta_u) = \delta_{k0} , \quad k \in \mathbb{Z}_+ . $$

Therefore by Lemma 2.5

$$ \int_{-d}^{d} p_k(t)\, d(E_t \delta_u, \delta_u) = \int_{-d}^{d} p_k(t)\, d\sigma^d_{KM}(t) $$

for any $k \in \mathbb{Z}_+$, and hence

$$ \int_{-d}^{d} p(t)\, d(E_t \delta_u, \delta_u) = \int_{-d}^{d} p(t)\, d\sigma^d_{KM}(t) $$

for any polynomial $p$. □
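The orthonormality stated in Lemma 2.5 can be checked numerically. The sketch below is an illustration rather than part of the proof: it evaluates the polynomials by the recurrence (10) and integrates against the Kesten-McKay density $\frac{d}{2\pi}\sqrt{4(d-1)-t^2}/(d^2-t^2)$ using the substitution $t = 2\sqrt{d-1}\cos\theta$ and the midpoint rule. The function names and the number of nodes are arbitrary choices.

```python
import math

def p_values(d, t, kmax):
    """Values p_0(t), ..., p_kmax(t) of the polynomials defined by the recurrence (10)."""
    vals = [1.0, t / math.sqrt(d), (t * t - d) / math.sqrt(d * (d - 1))]
    for k in range(2, kmax):
        vals.append(t * vals[k] / math.sqrt(d - 1) - vals[k - 1])
    return vals[: kmax + 1]

def km_gram_matrix(d, kmax, nodes=4000):
    """Gram matrix of p_0..p_kmax with respect to the Kesten-McKay measure.

    With t = a cos(theta), a = 2 sqrt(d-1), the measure becomes
    d a^2 sin^2(theta) / (2 pi (d^2 - t^2)) d(theta) on [0, pi].
    """
    a = 2.0 * math.sqrt(d - 1)
    h = math.pi / nodes
    gram = [[0.0] * (kmax + 1) for _ in range(kmax + 1)]
    for i in range(nodes):
        theta = (i + 0.5) * h
        t = a * math.cos(theta)
        w = d * a * a * math.sin(theta) ** 2 / (2 * math.pi * (d * d - t * t)) * h
        p = p_values(d, t, kmax)
        for k in range(kmax + 1):
            for l in range(kmax + 1):
                gram[k][l] += w * p[k] * p[l]
    return gram

gram = km_gram_matrix(d=3, kmax=5)
max_error = max(
    abs(gram[k][l] - (1.0 if k == l else 0.0))
    for k in range(6) for l in range(6)
)
```

Up to quadrature and rounding error, `gram` is the identity matrix, which is the statement of the lemma for $k, l \le 5$ and $d = 3$.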

2.3. Limit theorems for finite graphs. Let $G_n = (V_n, E_n)$ be a
sequence of $d$-regular graphs,

$$ N_n = \#V_n \to \infty \quad (n \to \infty), \tag{11} $$

and let $M_n$ be a symmetric sign matrix on $G_n$. The following questions
arise:

(a) Is it true that

$$ \mu_{M_n} \to \sigma^d_{KM} \tag{12} $$

for every sequence $M_n$?

(b) Does (12) hold for $M_n = A(G_n)$?

(c) Does (12) hold (a.s.) for a random sequence $M_n$ (that is, the
entries of $M_n$ are random and independent up to the symmetry
assumption,
$P\{M_{n,uv} = 1\} = P\{M_{n,uv} = -1\} = 1/2$, $(u, v) \in E_n$)?

(d) Does the average spectral measure $E\mu_{M_n}$ (with respect to the
random choice of $M_n$ as in (c)) converge to $\sigma^d_{KM}$?

It is easy to see that (a) $\Rightarrow$ (b) and (a) $\Rightarrow$ (c) $\Rightarrow$ (d). In fact, all
four are equivalent.

Denote by $c_k(G)$ the number of closed paths $(u_0, u_1, \dots, u_k = u_0)$ in
$G$, such that $(u_{j-1}, u_j) \in E$ for $1 \le j \le k$, and $u_j \ne u_{(j+2) \bmod k}$ for
$1 \le j \le k$.

If the numbers $c_k(G)$ are small, $G$ looks locally like a tree; hence the
spectral properties of matrices on $G$ should resemble those of matrices
on $H_d$ (cf. Proposition 2.3). This is indeed the case; the following
proposition generalises the result of McKay [19] on adjacency matrices
(see also Serre).

Proposition 2.6. For every one of the questions (a)-(d), the answer
is positive iff $c_k(G_n)/N_n \to 0$ for $k = 1, 2, \dots$.
To prove the proposition, we need some notation. Let

$$ \mathfrak{W}_{uv}(k) = \mathfrak{W}_{uv}(k, G) = \{(u_0 = u, u_1, \dots, u_k = v) \mid (u_j, u_{j+1}) \in E\} $$

be the collection of paths from $u$ to $v$ in $G$. Consider the subcollection

$$ \mathcal{W}_{uv}(k) = \{(u_0, \dots, u_k) \in \mathfrak{W}_{uv}(k) \mid u_j \ne u_{j-2} \text{ for } j \ge 2\} $$

of non-backtracking paths, and the subsubcollection

$$ \mathcal{W}^{even}_{uv}(k) \subset \mathcal{W}_{uv}(k) $$

of paths on which every edge appears an even number of times.
Finally, denote

$$ \mathfrak{W}(k, G) = \bigcup_{u \in V} \mathfrak{W}_{uu}(k, G), \quad
   \mathcal{W}(k, G) = \bigcup_{u \in V} \mathcal{W}_{uu}(k, G), \quad
   \mathcal{W}^{even}(k, G) = \bigcup_{u \in V} \mathcal{W}^{even}_{uu}(k, G) . $$

Lemma 2.7. Let $G = (V, E)$ be a $d$-regular graph and let $p_k = p_{k,d}$ be
defined as in (10).

(1) For any symmetric sign matrix $M$ on $G$, and any $u, v \in V$,

$$ p_k(M)_{uv} = (p_k(M)\delta_u, \delta_v)
   = \frac{1}{\sqrt{d}\,(d-1)^{(k-1)/2}} \sum M_{u_0 u_1} M_{u_1 u_2} \cdots M_{u_{k-1} u_k} , \tag{13} $$

where the sum is over $(u_0, u_1, \dots, u_k) \in \mathcal{W}_{uv}(k)$.

(2) In particular,

$$ |(p_k(M)\delta_u, \delta_u)| \le \frac{\#\mathcal{W}_{uu}(k)}{\sqrt{d}\,(d-1)^{(k-1)/2}} , \tag{14} $$

with equality for $M = \pm A(G)$.

(3) For a randomly chosen $M$,

$$ E\,(p_k(M)\delta_u, \delta_u) = \frac{\#\mathcal{W}^{even}_{uu}(k)}{\sqrt{d}\,(d-1)^{(k-1)/2}} . \tag{15} $$
Proof.

(1) For $k = 1$, the statement is trivial. Next,

$$ p_2(M)_{uv} = \frac{1}{\sqrt{d(d-1)}} (M^2 - d\,I)_{uv}
   = \begin{cases}
     \frac{1}{\sqrt{d(d-1)}} \sum_w M_{uw} M_{wv}, & u \ne v \\
     \frac{1}{\sqrt{d(d-1)}} \left( \sum_w M_{uw}^2 - d \right) = 0, & u = v .
   \end{cases} \tag{16} $$

On the other hand,

$$ \mathcal{W}_{uv}(2) = \begin{cases}
     \{(u, w, v) \mid (u, w), (w, v) \in E\}, & u \ne v \\
     \emptyset, & u = v ;
   \end{cases} $$

therefore the right-hand side of (13) for $k = 2$ is equal to the
right-hand side of (16).
Now proceed by induction.

(2) Follows immediately from (1).

(3) Take the expectation of both sides of (13) and observe that
if $s_1, \dots, s_\ell$ are random signs drawn with replacement from a
collection $\mathfrak{S}$ of independent random signs, then

$$ E\, s_1 s_2 \cdots s_\ell = \begin{cases}
     1, & \text{every term } s \in \mathfrak{S} \text{ appears an even number of times in the product ($\ell$ is even!)} \\
     0, & \text{otherwise.}
   \end{cases} $$

□
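For $M = A(G)$ every product in (13) equals 1, so part (1) says that $\sqrt{d}\,(d-1)^{(k-1)/2}\, p_k(A)_{uv}$ is the number of non-backtracking paths of length $k$ from $u$ to $v$. The following Python sketch checks this by brute force on the Petersen graph ($d = 3$), working with the integer matrices $Q_k = \sqrt{d}\,(d-1)^{(k-1)/2} p_k(A)$, for which the recurrence (10) reads $Q_1 = A$, $Q_2 = A^2 - dI$, $Q_{k+1} = AQ_k - (d-1)Q_{k-1}$. The choice of graph and all function names are illustrative assumptions, not taken from the text.

```python
def petersen():
    """Adjacency matrix of the Petersen graph (3-regular, 10 vertices)."""
    n = 10
    adj = [[0] * n for _ in range(n)]
    for i in range(5):
        for a, b in ((i, (i + 1) % 5), (i, i + 5), (i + 5, 5 + (i + 2) % 5)):
            adj[a][b] = adj[b][a] = 1
    return adj

def nb_walk_counts(adj, k):
    """Brute force: counts[u][v] = number of non-backtracking paths of length k."""
    n = len(adj)
    counts = [[0] * n for _ in range(n)]

    def extend(start, prev, cur, steps):
        if steps == k:
            counts[start][cur] += 1
            return
        for nxt in range(n):
            if adj[cur][nxt] and nxt != prev:   # u_j != u_{j-2}
                extend(start, cur, nxt, steps + 1)

    for u in range(n):
        extend(u, None, u, 0)
    return counts

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def unnormalised_p(adj, d, k):
    """Q_k = sqrt(d) (d-1)^((k-1)/2) p_k(A), computed in exact integer arithmetic."""
    n = len(adj)
    q_prev = [row[:] for row in adj]                      # Q_1 = A
    if k == 1:
        return q_prev
    a2 = matmul(adj, adj)
    q_cur = [[a2[i][j] - (d if i == j else 0) for j in range(n)]
             for i in range(n)]                           # Q_2 = A^2 - d I
    for _ in range(k - 2):
        aq = matmul(adj, q_cur)
        q_next = [[aq[i][j] - (d - 1) * q_prev[i][j] for j in range(n)]
                  for i in range(n)]
        q_prev, q_cur = q_cur, q_next
    return q_cur

adj = petersen()
checks = [unnormalised_p(adj, 3, k) == nb_walk_counts(adj, k) for k in range(1, 7)]
```

All entries of `checks` are `True`; in particular the diagonal of $Q_k$ vanishes for $k < 5$, since the Petersen graph has no cycles shorter than 5.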

Recall the following fact (cf. Feller [9, Ch. VIII, §6]):

Proposition. Let $(\mu_n)$ be a sequence of probability measures such that

$$ \int x^k\, d\mu_n(x) \to \int x^k\, d\mu(x), \quad k = 1, 2, 3, \dots , $$

where $\mu$ is a probability measure with compact support. Then
$\mu_n \to \mu$ weakly.

Now Proposition 2.6 follows from the next lemma:
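Lemma 2.8. Let $G_n = (V_n, E_n)$ be a sequence of $d$-regular graphs,

$$ \#V_n \to \infty \quad (n \to \infty) . $$

The following are equivalent:

(1) For any $k \in \mathbb{N}$,

$$ \#\mathcal{W}(k, G_n) / \#V_n \to 0 $$

as $n \to \infty$.

(2) For any $k \in \mathbb{N}$,

$$ \#\mathcal{W}^{even}(k, G_n) / \#V_n \to 0 . $$

(3) For any $k \in \mathbb{N}$,

$$ c_k(G_n) / \#V_n \to 0 . $$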
Proof. First, $\mathcal{W}^{even}(k, G) \subset \mathcal{W}(k, G)$; hence

$$ \#\mathcal{W}^{even}(k, G) \le \#\mathcal{W}(k, G) $$

and (1) $\Rightarrow$ (2). Similarly, $c_k(G) \le \#\mathcal{W}^{even}(2k, G)$ (just concatenate a
closed path to itself), and so (2) $\Rightarrow$ (3). Finally,

$$ \#\mathcal{W}(k, G) = c_k(G) + \sum_{1 \le r < k/2} (d-2)(d-1)^{r-1}\, c_{k-2r}(G) ; $$

therefore (3) $\Rightarrow$ (1).

□
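The last identity in the proof decomposes a closed non-backtracking path into a cyclically non-backtracking core of length $k - 2r$ and a "tail" of length $r$ traversed back and forth. A brute-force check on the complete graph $K_4$ ($d = 3$) can be sketched as follows; the graph and the function names are our illustrative choices.

```python
def closed_nb_walks(adj, k):
    """All closed non-backtracking paths (u_0, ..., u_k = u_0) of length k."""
    n = len(adj)
    found = []

    def extend(walk):
        if len(walk) == k + 1:
            if walk[-1] == walk[0]:
                found.append(tuple(walk))
            return
        for nxt in range(n):
            if adj[walk[-1]][nxt] and (len(walk) < 2 or nxt != walk[-2]):
                extend(walk + [nxt])

    for u in range(n):
        extend([u])
    return found

def count_W(adj, k):
    """#W(k, G): closed non-backtracking paths, base point and direction counted."""
    return len(closed_nb_walks(adj, k))

def count_c(adj, k):
    """c_k(G): closed paths that remain non-backtracking when read cyclically."""
    return sum(
        all(w[j] != w[(j + 2) % k] for j in range(k))
        for w in closed_nb_walks(adj, k)
    )

def right_hand_side(adj, d, k):
    total = count_c(adj, k)
    r = 1
    while 2 * r < k:
        total += (d - 2) * (d - 1) ** (r - 1) * count_c(adj, k - 2 * r)
        r += 1
    return total

k4 = [[int(i != j) for j in range(4)] for i in range(4)]
identity_holds = all(count_W(k4, k) == right_hand_side(k4, 3, k) for k in range(1, 9))
```

For instance, `count_c(k4, 3)` is 24 (four triangles, each with 3 base points and 2 directions), and these triangles reappear with a tail of length 1 among the closed non-backtracking paths of length 5.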

3. Spectral measure: proofs 

3.1. McKay's theorem. Let $(G_n)$ be a sequence of random $d$-regular
graphs: $G_n$ is chosen uniformly from the collection of all $d$-regular
graphs on $n$ vertices; let $M_n$ be a symmetric sign matrix on $G_n$.

Proposition. For any $k \in \mathbb{N}$, $c_k(G_n)/n \to 0$ in distribution as $n \to \infty$.

This proposition was first proved by Wormald; see also McKay,
Wormald and Wysocka [21] and the discussion below.

Corollary 3.1. Let $\tilde{M}_n$ be an $n \times n$ symmetric $\pm 1$ matrix, $n = 1, 2, \dots$.
If

$$ M_n = \tilde{M}_n \circ A(G_n) $$

is the Hadamard product of $\tilde{M}_n$ and $A(G_n)$, that is,
$(M_n)_{uv} = (\tilde{M}_n)_{uv}\, A(G_n)_{uv}$, then

$$ \mu_{M_n} \to \sigma^d_{KM} $$

weakly, in distribution, as $n \to \infty$.

In particular (for $(\tilde{M}_n)_{uv} = 1$, $1 \le u, v \le n$), we recover McKay's
theorem formulated in Subsection 1.4; this is very similar to the original
proof in [19].

Now we aim for an estimate on the rate of convergence.

Lemma 3.2. Let $\mu$ be a probability measure on $\mathbb{R}$ such that

$$ \left| \int p_{k,d}\, d\mu \right| \le \varepsilon_k , \quad 1 \le k \le 2m - 2 . \tag{17} $$

Then

$$ d_K(\mu, \sigma^d_{KM}) \le C \left( 1/m + m^6 \sqrt{\sum_{k=1}^{2m-2} \varepsilon_k^2} \right) , $$

where $C > 0$ is a universal constant.

The case $\varepsilon_1 = \dots = \varepsilon_{2m-2} = 0$ follows from the Chebyshev-Markov-
Stieltjes inequalities (cf. Akhiezer [1]); we present the proof of the
general case in Subsection 5.2 (see Proposition 5.6 and Remarks 5.7, 5.8).

Definition 3.3. The girth $\gamma(G)$ of a graph $G$ is the length of the
shortest cycle in $G$. In other words,

$$ \gamma(G) = \min\{k \mid c_k(G) > 0\} . $$
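For a finite graph the girth can be computed directly, without counting closed paths. The sketch below uses the standard breadth-first search from every vertex; it assumes a simple undirected graph given by adjacency lists, and the function name `girth` is our choice.

```python
from collections import deque

def girth(adj_list):
    """Length of the shortest cycle of a simple undirected graph (inf if acyclic)."""
    best = float("inf")
    for root in range(len(adj_list)):
        dist = {root: 0}
        parent = {root: None}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj_list[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    parent[v] = u
                    queue.append(v)
                elif parent[u] != v:
                    # a non-tree edge closes a cycle of length at most this
                    best = min(best, dist[u] + dist[v] + 1)
    return best

# Petersen graph: outer 5-cycle, spokes, inner pentagram
petersen = [[] for _ in range(10)]
for i in range(5):
    for a, b in ((i, (i + 1) % 5), (i, i + 5), (i + 5, 5 + (i + 2) % 5)):
        petersen[a].append(b)
        petersen[b].append(a)

# complete graph on 4 vertices
k4 = [[j for j in range(4) if j != i] for i in range(4)]
```

Here `girth(petersen)` is 5 and `girth(k4)` is 3.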

The following proposition was proved by McKay [19] with a slightly
weaker estimate, and later by Li and Sole [17] using the argument that
we reproduce here.

Proposition 3.4 (McKay, Li-Sole). Let $G$ be a $d$-regular graph. Then

$$ d_K(\mu_{A(G)}, \sigma^d_{KM}) \le \frac{C'}{\gamma(G)} , $$

where $C' > 0$ is a universal constant.

Proof. By Lemma 2.7,

$$ \int p_k\, d\mu_{A(G)} = \sum_{i=1}^{n} p_k(\lambda_i(A(G))) / n
   = \operatorname{tr} p_k(A) / n
   = \#\mathcal{W}(k, G) \big/ \left( n \sqrt{d}\,(d-1)^{(k-1)/2} \right) = 0 $$

for $1 \le k < \gamma(G)$. Therefore by Lemma 3.2 (with all $\varepsilon_k$ equal to 0)

$$ d_K(\mu_{A(G)}, \sigma^d_{KM}) \le \frac{C}{\lfloor \gamma(G)/2 \rfloor} . $$

□

Remark 3.5. Obviously, the last proposition is valid for any symmetric
sign matrix $M$ on $G$.

Unfortunately, the girth of a (typical) random $d$-regular graph is
$O(1)$; therefore the proposition is not applicable. To obtain a
meaningful bound in McKay's theorem for random graphs, we use the full
strength of Lemma 3.2 as well as the estimates on $\#\mathcal{W}(k, G)$ that can
be extracted from the work of McKay, Wormald and Wysocka [21]. We
omit the details that lead to

Proposition 3.6. Let $G$ be a random $d$-regular graph on $n$ vertices.
Then

$$ d_K(\mu_{A(G)}, \sigma^d_{KM}) \le C\, \frac{\log d}{\log n} $$

with probability $1 - o(1)$ (as $n \to \infty$), where $C > 0$ is a constant
independent of $d$ and $n$. Moreover, with probability $1 - o(1)$,

$$ d_K(\mu_M, \sigma^d_{KM}) \le C\, \frac{\log d}{\log n} $$

for all sign matrices $M$ on $G$ (simultaneously).

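3.2. Wigner's law. Let $A$ be a random $n \times n$ matrix, as in (2). Then

$$ A = \tilde{A}/(2\sqrt{n}) + D , $$

where $\tilde{A}$ is a random symmetric sign matrix on the complete graph
$K_n$ (every two vertices are connected by an edge), and $D$ is a diagonal
matrix,

$$ \|D\| = 1/(2\sqrt{n}) . \tag{18} $$

We will show that

$$ \mu_A \overset{(4)}{\approx} \mu_{\tilde{A}/(2\sqrt{n})}
   \overset{(3)}{\approx} \mu_{\tilde{A}/(2\sqrt{n-2})}
   \overset{(2)}{\approx} \tilde{\sigma}^{n-1}_{KM}
   \overset{(1)}{\approx} \sigma_W , $$

where $\tilde{\sigma}^d_{KM}$ is the Kesten-McKay measure scaled to $[-1, 1]$:

$$ d\tilde{\sigma}^d_{KM}(x) = \frac{2d(d-1)}{\pi}\, \frac{\sqrt{1 - x^2}}{d^2 - 4(d-1)x^2}\, dx . $$

Step 1: Let $d \ge 3$. Then

$$ \begin{aligned}
d_K(\tilde{\sigma}^d_{KM}, \sigma_W)
 &\le \int_{-1}^{1} \left| \frac{2d(d-1)}{\pi}\, \frac{\sqrt{1 - x^2}}{d^2 - 4(d-1)x^2} - \frac{2}{\pi} \sqrt{1 - x^2} \right| dx \\
 &= \int_{-1}^{1} \frac{|d - 4(d-1)x^2|}{d^2 - 4(d-1)x^2}\, \frac{2}{\pi} \sqrt{1 - x^2}\, dx \\
 &\le \frac{3d}{(d-2)^2} \le C/d
\end{aligned} $$

for some universal constant $C > 0$.
In particular,

$$ d_K(\tilde{\sigma}^{n-1}_{KM}, \sigma_W) \le C_1/n . $$

Step 2: Observe that

$$ d_K(\mu_{\tilde{A}/(2\sqrt{n-2})}, \tilde{\sigma}^{n-1}_{KM}) = d_K(\mu_{\tilde{A}}, \sigma^{n-1}_{KM}) . $$

Now we are in the familiar setting of symmetric sign matrices on a
graph.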
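First consider the average spectral measure $E\mu_{\tilde{A}}$. By Lemma 3.2,

$$ d_K(E\mu_{\tilde{A}}, \sigma^{n-1}_{KM}) \le C \left( 1/m + m^6 \sqrt{\sum_{k=1}^{2m-2} \left( \int p_{k,n-1}(x)\, dE\mu_{\tilde{A}}(x) \right)^2} \right) ; \tag{19} $$

we will take $m = c\, n^{1/10}$. By Lemma 2.7,

$$ \int p_{k,n-1}(x)\, dE\mu_{\tilde{A}}(x) = \frac{1}{n} \sum_{u=1}^{n} E\,(p_{k,n-1}(\tilde{A})\delta_u, \delta_u)
   = \frac{\#\mathcal{W}^{even}(k, K_n)}{n \sqrt{n-1}\,(n-2)^{(k-1)/2}} . $$

Obviously, $\mathcal{W}^{even}(k, K_n) = \emptyset$ for odd $k$, whereas for even $k$

$$ \#\mathcal{W}^{even}(k, K_n) \le C k\, n^{k/2} $$

by Proposition 6.2 (that we prove in Subsection 6.2). Hence

$$ 0 \le \int p_{k,n-1}(x)\, dE\mu_{\tilde{A}}(x) \le Ck/n . \tag{20} $$

By (19), we have proved that

$$ d_K(E\mu_{\tilde{A}}, \sigma^{n-1}_{KM}) \le C_1/n^{1/10} \tag{21} $$

and therefore

$$ \begin{aligned}
d_K(E\mu_{\tilde{A}/(2\sqrt{n-2})}, \sigma_W)
 &\le d_K(E\mu_{\tilde{A}/(2\sqrt{n-2})}, \tilde{\sigma}^{n-1}_{KM}) + d_K(\tilde{\sigma}^{n-1}_{KM}, \sigma_W) \\
 &= d_K(E\mu_{\tilde{A}}, \sigma^{n-1}_{KM}) + d_K(\tilde{\sigma}^{n-1}_{KM}, \sigma_W) \le C_2/n^{1/10} .
\end{aligned} \tag{22} $$

Steps 3 and 4: It remains to recall (18) and deduce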

Proposition 3.7. There exists a universal constant $C$ such that, for
a random matrix $A$ defined by (2),

$$ d_K(E\mu_A, \sigma_W) \le C/n^{1/10} . \tag{23} $$

With some more effort, it is possible to prove a slightly stronger
proposition:

Proposition 3.8. There exists a universal constant $C$ such that, for
a random matrix $A$ defined by (2),

$$ d_K(\mu_A, \sigma_W) \le C/n^{1/10} \tag{24} $$

with probability $1 - o(1)$ (as $n \to \infty$).
Remark 3.9. Götze and Tikhomirov proved [14] that the left-hand sides
of both (23) and (24) are not greater than $C/\sqrt{n}$; however, their
argument is much more intricate.

3.3. Marchenko-Pastur law. Let $B$ be a random $n \times N$ matrix, as
in Subsection 1.3. Define an $(n + N) \times (n + N)$ matrix $S$ in the following way:

$$ S = \sqrt{N} \begin{pmatrix} 0 & B \\ B^t & 0 \end{pmatrix} . $$

Then $S$ is a symmetric sign matrix on the complete bipartite graph
$K_{n,N} = (V_{n,N}, E_{n,N})$,

$$ V_{n,N} = \{1', \dots, n', 1'', \dots, N''\}, \qquad
   E_{n,N} = \{(u', v'') \mid 1 \le u \le n,\ 1 \le v \le N\} . $$

The graph $K_{n,N}$ is not regular (unless $n = N$); however, it is
bi-regular (of bi-degree $(N, n)$).

Definition 3.10. A graph $G = (V' \cup V'', E)$ is called bi-regular (of
bi-degree $(d', d'')$) if

(1) $E \subset V' \times V''$;

(2) The degree of every vertex $v' \in V'$ equals $d'$, and the degree of
every vertex $v'' \in V''$ equals $d''$.

Li and Sole proved [17] an analogue of Lemma 2.7 for bi-regular
graphs and used it to recover the spectral measure of the bi-regular
tree (first computed by Godsil and Mohar [12]), and to show that the
spectral measure is not far from it for finite bi-regular graphs of large
girth, and for random bi-regular graphs. Here we focus on the limiting
case $n, N \to \infty$.

Let

$$ \xi_1 = (n-2)/N , \qquad \xi_2 = (n-1)(N-1)/N^2 ; $$

note that $\xi_1, \xi_2 \to \xi$ under the assumptions of the Marchenko-Pastur
theorem. Define a sequence of polynomials $q_k = q_{k,\xi_1,\xi_2}$:

$$ q_0(t) = 1, \qquad q_1(t) = (t - 1)/\sqrt{\xi_2}, \qquad
   q_{k+1}(t) = (t - 1 - \xi_1)\, q_k(t)/\sqrt{\xi_2} - q_{k-1}(t) . $$
Lemma 3.11.

(1) The polynomials $q_k$ are orthogonal with respect to a certain
(explicit) measure $\sigma^{n,N}_{GM}$ supported on

$$ [1 - 2\sqrt{\xi_2} + \xi_1,\ 1 + 2\sqrt{\xi_2} + \xi_1] . $$

(2) If $n, N \to \infty$ so that $n/N \to \xi$, the measure $\sigma^{n,N}_{GM}$ converges weakly
to the Marchenko-Pastur measure $\sigma^\xi_{MP}$. Moreover,

$$ d_K(\sigma^{n,N}_{GM}, \sigma^\xi_{MP}) \le C/n . $$

Sketch of proof. Both facts can be deduced from an explicit formula for
$\sigma^{n,N}_{GM}$ that follows from the Bernstein-Szego formulae in Subsection 5.1 (cf.
Li and Sole [17]). □

Remark 3.12. For fixed $k$,

$$ q_{k,\xi_1,\xi_2} \to q_{k,\xi,\xi} ; $$

$q_{k,\xi,\xi}$ are orthogonal with respect to $\sigma^\xi_{MP}$ according to Example 5.5 in
Subsection 5.1. Therefore the convergence in (2) can be seen without
writing the explicit formulae for $\sigma^{n,N}_{GM}$.

The following lemma is an analogue of Lemma 2.7; the proof is
analogous.

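Lemma 3.13. If $M$ is an $n \times N$ matrix the entries of which are equal
to $\pm 1$, then

$$ q_k(MM^t/N)_{uv} = \frac{1}{(nN)^{k/2}} \sum M_{u_0 u_1} M_{u_2 u_1} M_{u_2 u_3} \cdots M_{u_{2k} u_{2k-1}} , $$

where the sum is over

$$ (u_0', u_1'', u_2', u_3'', \dots, u_{2k-1}'', u_{2k}') \in \mathcal{W}_{u'v'}(2k, K_{n,N}) . $$

Now,

$$ \begin{aligned}
\int q_k\, dE\mu_C &= \frac{1}{n}\, E \operatorname{tr} q_k(C) \\
 &= \sum_{u=1}^{n} \#\mathcal{W}^{even}_{u'u'}(2k, K_{n,N}) \big/ \left( n (nN)^{k/2} \right) \\
 &\le \#\mathcal{W}^{even}(2k, K_{n,N}) \big/ \left( n (nN)^{k/2} \right) .
\end{aligned} $$

For $k \le c\, \xi^{3/20} n^{1/10}$, the last quantity is bounded by

$$ Ck/n $$

according to Proposition 6.4.

Proceeding as in the previous subsection, with the general
Proposition 5.6 (and the following remarks) instead of Lemma 3.2, we can
deduce the following form of the Marchenko-Pastur theorem: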
Proposition 3.14. Under the assumptions of the Marchenko-Pastur
theorem,

$$ d_K(E\mu_C, \sigma^\xi_{MP}) \le C / (\xi^{3/20} n^{1/10}) ; $$

moreover,

$$ d_K(\mu_C, \sigma^\xi_{MP}) \le C' / (\xi^{3/20} n^{1/10}) $$

with probability $1 - o(1)$.

Remark 3.15. For $\xi$ bounded away from 1, Götze and Tikhomirov
proved [15] a better estimate $C/n^{1/2}$ for the left-hand sides in these
inequalities.

4. Extremal eigenvalues 

4.1. Preliminaries. In the previous sections, the convergence of the
spectral measure $\mu_{A_n} \to \sigma$ followed from the convergence

$$ \int P_k\, d\mu_{A_n} \to 0, \quad k = 1, 2, 3, \dots , \tag{25} $$

where $P_k$ are the orthogonal polynomials with respect to $\sigma$.

To obtain convergence, we only needed (25) to hold for (every) fixed
$k$. However, in some of the examples, the integral on the left-hand side
of (25) is small also for $k$ growing with $n$. If this is the case (for $k$
growing fast enough), no eigenvalues of $A_n$ can lie far from the support
of $\sigma$. We formalise this observation in this section.

Bai and Yin [5] applied a similar method (in implicit form) for
random covariance matrices. In [4], exponentially decaying estimates on
the probability of deviations were obtained for this case, using the
method of Bai and Yin and a formalism similar to that of the present
note. In particular, Subsection 4.3 reproduces some of the results in
[4] (correcting minor errors and misprints).

4.2. The Füredi-Komlós theorem. Let $A$ be a random matrix
defined as in (2). As in the first paragraph of Subsection 3.2,

$$ A = \tilde{A}/(2\sqrt{n}) + D , $$

where $\tilde{A}$ is a random sign matrix on the complete graph $K_n$ and
$\|D\| \le 1/(2\sqrt{n})$. Recall the estimate (20):

$$ 0 \le E \sum_{i=1}^{n} p_{k,n-1}(\lambda_i(\tilde{A})) \le Ck, \quad k \le c\, n^{1/10} . $$

By Chebyshev's inequality,

$$ P\left\{ \sum_{i=1}^{n} p_{k,n-1}(\lambda_i(\tilde{A})) \ge L \right\} \le Ck/L, \quad L > 0 . \tag{26} $$

Now, $p_{k,n-1}$ are orthogonal with respect to the measure $\sigma^{n-1}_{KM}$
supported on $[-2\sqrt{n-2},\ 2\sqrt{n-2}]$. Therefore, for large $k$, $p_{k,n-1}$ tend to
infinity very fast outside this interval. More formally, we have the
following

Lemma 4.1. There exists a universal constant $C > 0$ such that the
inequalities

(1) $\inf_{t \in \mathbb{R}} p_k(t) \ge -Ck$;

(2) $\inf_{|t| \ge 2\sqrt{d-1}\,(1+\varepsilon)} p_k(t) \ge \exp(C^{-1} k \sqrt{\varepsilon})$

hold for any even $k \ge 2$ and any $0 < \varepsilon < 1$.

These estimates follow from the formulae in Example 5.3 combined
with (29).
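Suppose $\tilde{A}$ has at least one eigenvalue outside

$$ \left( -2\sqrt{n-2}\,(1 + \varepsilon),\ 2\sqrt{n-2}\,(1 + \varepsilon) \right) , \quad \varepsilon \ge C_2 \log^2 n / k^2 . $$

Then, by the above lemma,

$$ \sum_{i=1}^{n} p_{k,n-1}(\lambda_i(\tilde{A})) \ge \exp(C^{-1} k \sqrt{\varepsilon}) - C(n-1)k \ge \exp(C_1^{-1} k \sqrt{\varepsilon}) . $$

According to (26), the probability of this event is at most

$$ Ck \exp(-C_1^{-1} k \sqrt{\varepsilon}) \le \exp(-C_3^{-1} k \sqrt{\varepsilon}) . $$

Taking $k = 2 \lfloor c\, n^{1/10}/2 \rfloor$ and recalling (18), we obtain the following
quantitative form of the Füredi-Komlós theorem: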

Theorem 4.2. Let $A$ be a random symmetric $n \times n$ matrix (as in (2));
let

$$ C \log^2 n / n^{1/5} \le \varepsilon \le 1 . $$

Then

$$ P\{\|A\| \ge 1 + \varepsilon\} \le \exp(-C^{-1} n^{1/10} \sqrt{\varepsilon}) ; \tag{27} $$

here $C > 0$ is a universal constant.

In particular, we recover Theorem 1.6 with $\alpha_1 = 1/10$, $\alpha_2 = 1/2$,
$\alpha_3 = 0.0999$.

General concentration results yield an improvement $\alpha_1 = 1$, $\alpha_2 = 2$;
this was brought to our attention by Michel Ledoux. The formal
argument is as follows:
Theorem 4.3. Let $A$ be a random symmetric $n \times n$ matrix (as in (2));
let

$$ C_1 \log^2 n / n^{1/5} \le \varepsilon \le 1 . $$

Then

$$ P\{\|A\| \ge 1 + \varepsilon\} \le \exp(-C_1^{-1} n \varepsilon^2) ; \tag{28} $$

here $C_1 > 0$ is a universal constant.

Proof. By (27) with $\varepsilon = C \log^2 n / n^{1/5}$, the median of $\|A\|$ is rather
close to 1:

$$ \operatorname{Med} \|A\| \le 1 + C \log^2 n / n^{1/5} . $$

Therefore by the result of Alon, Krivelevich and Vu [3],

$$ P\{\|A\| \ge 1 + C \log^2 n / n^{1/5} + \varepsilon\} \le 8 \exp(-n \varepsilon^2 / 32) . $$

□

Remark 4.4. The original proof of Boutet de Monvel and Shcherbina [6]
yields $\alpha_1 = 1/2$, $\alpha_2 = 3/2$, $\alpha_3 = 0.333$. The estimate (28) with slightly
better constants can also be deduced from a corresponding estimate
for Gaussian matrices.

4.3. Bai-Yin theorem. Proceed similarly to the proof of the
Füredi-Komlós theorem. According to Subsection 3.3,

$$ E \sum_{i=1}^{n} q_k(\lambda_i(C)) \le Ck $$

for $k \le c\, \xi^{3/20} n^{1/10}$; hence

$$ P\left\{ \sum_{i=1}^{n} q_k(\lambda_i(C)) \ge L \right\} \le Ck/L . $$

Lemma 4.1 extends verbatim:

Lemma 4.5. There exists a universal constant $C > 0$ such that the
inequalities

(1) $\inf_{t \in \mathbb{R}} q_k(t) \ge -Ck$;

(2) $\inf_{|t - 1 - \xi_1| \ge 2\sqrt{\xi_2}\,(1+\varepsilon)} q_k(t) \ge \exp(C^{-1} k \sqrt{\varepsilon})$

hold for any even $k \ge 2$ and any $0 < \varepsilon < 1$.

Now assume $C$ has at least one eigenvalue outside

$$ [(1 - \sqrt{\xi})^2 - \varepsilon,\ (1 + \sqrt{\xi})^2 + \varepsilon] . $$

Then

$$ \sum_{i=1}^{n} q_k(\lambda_i(C)) \ge \exp(C^{-1} k \sqrt{\varepsilon/\xi}) - C_1 k n \ge \exp(C_2^{-1} k \sqrt{\varepsilon/\xi}) $$

if $\varepsilon \ge C_3\, \xi \log^2 n / k^2$. The probability of this event is at most

$$ C_4 k \exp(-C_2^{-1} k \sqrt{\varepsilon/\xi}) \le \exp(-C_5^{-1} k \sqrt{\varepsilon/\xi}) . $$

We have thus proved
Theorem 4.6. The probability that $C$ has eigenvalues outside

$$ [(1 - \sqrt{\xi})^2 - \varepsilon,\ (1 + \sqrt{\xi})^2 + \varepsilon] $$

is at most

$$ \exp(-C^{-1} \xi^{-7/20} n^{1/10} \varepsilon^{1/2}) $$

for

$$ \frac{C\, \xi^{14/20} \log^2 n}{n^{1/5}} \le \varepsilon \le 1 . $$

In particular, we recover Theorem 1.10 with $\beta_1 = 1/10$, $\beta_2 = 1/2$,
$\beta_3 = 0.0999$.

Remark 4.7. Similarly to the proof of Theorem 4.3, general
concentration results yield an improvement $\beta_1 = 1$, $\beta_2 = 2$ in (6); this follows
from the result of Meckes [22]. We are not familiar with a corresponding
argument for (5).

5. Bernstein-Szego measures 

5.1. Some formulae. In this subsection we explain how to compute
the orthogonal polynomials with respect to the measures we encounter.
The formulae we need follow from some more general formulae, first
proved by S. N. Bernstein and G. Szego (see Szego [25, Theorem 2.6]).

Recall that the Chebyshev polynomials $U_k(x)$ (of the second kind)
are defined as

$$ U_k(\cos\theta) = \frac{\sin((k+1)\theta)}{\sin\theta} , \quad k \in \mathbb{Z} . \tag{29} $$

The following recurrence relation is well-known and easy to verify:

$$ 2x\, U_k(x) = U_{k+1}(x) + U_{k-1}(x) . $$

Proposition. Let $\sigma$ be a measure supported on the segment $[-1, 1]$,
such that

$$ d\sigma(x) = \frac{2}{\pi}\, \frac{\sqrt{1 - x^2}\, dx}{\gamma^2 \left( \alpha^2 + (1 - \beta)^2 + 2\alpha(1 + \beta)x + 4\beta x^2 \right)} , $$

where $\gamma > 0$ and $\alpha, \beta \in \mathbb{R}$ are such that the denominator is strictly
positive on $[-1, 1]$. Then the polynomials $P_k(x)$,

$$ P_k(x) = \begin{cases}
   \gamma \left( U_k(x) + \alpha U_{k-1}(x) + \beta U_{k-2}(x) \right), & k > 0 \\
   \frac{\gamma}{\sqrt{1 - \beta}} \left( U_k(x) + \alpha U_{k-1}(x) + \beta U_{k-2}(x) \right), & k = 0 ,
   \end{cases} \tag{30} $$

are orthogonal with respect to $\sigma$:

$$ \int_{-1}^{1} P_k(x) P_l(x)\, d\sigma(x) = \delta_{kl} , \quad k, l \ge 0 . $$

Remark 5.1. $P_k$ are linear combinations of $U_k$ and hence satisfy

$$ 2x\, P_k(x) = P_{k+1}(x) + P_{k-1}(x), \quad k = 2, 3, \dots . \tag{31} $$

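Example 5.2. If $\alpha = \beta = 0$ and $\gamma = 1$, then

$$ d\sigma(x) = d\sigma_W(x) = \frac{2}{\pi} \sqrt{1 - x^2}\, dx $$

is the Wigner measure;

$$ P_k(x) = U_k(x), \quad k = 0, 1, 2, \dots . $$

Example 5.3. Let $\alpha = 0$, $\beta = -(d-1)^{-1}$, and $\gamma = \sqrt{(d-1)/d}$. Then

$$ d\sigma(x) = d\tilde{\sigma}^d_{KM}(x) = \frac{2d(d-1)}{\pi}\, \frac{\sqrt{1 - x^2}}{d^2 - 4(d-1)x^2}\, dx $$

is the scaled Kesten-McKay measure;

$$ P_k(x) = \begin{cases}
   1, & k = 0 \\
   \sqrt{\frac{d-1}{d}} \left( U_k(x) - \frac{1}{d-1}\, U_{k-2}(x) \right), & k = 1, 2, 3, \dots .
   \end{cases} $$

Remark 5.4. Note that $p_{k,d}(x) = P_k(x / (2\sqrt{d-1}))$ (in view of (31), this is
easy to prove by induction). Therefore $p_{k,d}$ are orthogonal with respect
to $\sigma^d_{KM}$.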
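The identity of Remark 5.4 can be spot-checked numerically. The following sketch compares the recurrence (10) with the Chebyshev expression of Example 5.3 at a few points; the conventions $U_{-1} = 0$ and $U_{-2} = -1$ come from (29), while the function names, test points and values of $d$ are arbitrary choices made for the illustration.

```python
import math

def chebyshev_u(k, x):
    """Chebyshev polynomial of the second kind, with U_{-1} = 0 and U_{-2} = -1."""
    if k == -2:
        return -1.0
    if k == -1:
        return 0.0
    prev, cur = 0.0, 1.0                 # U_{-1}, U_0
    for _ in range(k):
        prev, cur = cur, 2.0 * x * cur - prev
    return cur

def bernstein_szego_P(k, x, d):
    """P_k of Example 5.3: alpha = 0, beta = -1/(d-1), gamma = sqrt((d-1)/d)."""
    if k == 0:
        return 1.0
    gamma = math.sqrt((d - 1) / d)
    return gamma * (chebyshev_u(k, x) - chebyshev_u(k - 2, x) / (d - 1))

def p_recurrence(k, t, d):
    """p_{k,d}(t) from the recurrence (10)."""
    if k == 0:
        return 1.0
    prev = t / math.sqrt(d)
    if k == 1:
        return prev
    cur = (t * t - d) / math.sqrt(d * (d - 1))
    for _ in range(k - 2):
        prev, cur = cur, t * cur / math.sqrt(d - 1) - prev
    return cur

max_gap = 0.0
for d in (3, 5, 10):
    scale = 2.0 * math.sqrt(d - 1)
    for k in range(9):
        for t in (-d, -1.3, 0.0, 0.7, 2.0, d):
            lhs = p_recurrence(k, t, d)
            rhs = bernstein_szego_P(k, t / scale, d)
            max_gap = max(max_gap, abs(lhs - rhs) / (1.0 + abs(lhs)))
```

The relative discrepancy `max_gap` is at the level of rounding error, as expected if the two definitions agree.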



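Example 5.5. If $\gamma = 1$, $\alpha = \sqrt{\xi}$, and $\beta = 0$, then

$$ d\sigma(x) = d\tilde{\sigma}^\xi_{MP}(x) = \frac{2}{\pi}\, \frac{\sqrt{1 - x^2}}{1 + \xi + 2\sqrt{\xi}\, x}\, dx $$

is the scaled Marchenko-Pastur probability measure;

$$ P_k(x) = \begin{cases}
   1, & k = 0 \\
   U_k(x) + \sqrt{\xi}\, U_{k-1}(x), & k = 1, 2, \dots .
   \end{cases} $$

Hence $q_{k,\xi,\xi}$ are orthogonal with respect to $\sigma^\xi_{MP}$.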
5.2. A proposition in the spirit of P. L. Chebyshev, A. A. Markov
and T. J. Stieltjes. Let $\sigma$ be a probability measure on $[-1, 1]$;
let $P_0, P_1, \dots$ be the sequence of orthogonal polynomials with respect
to $\sigma$, so that

$$ P_k(x) = \gamma_k x^k + \cdots , \quad \gamma_k > 0 . $$

Denote

$$ B_k = \max_{-1 \le x \le 1} |P_k(x)|, \qquad
   \rho_k(x) = 1 \Big/ \sum_{j=0}^{k} P_j(x)^2, \qquad
   b_k = \max_{-1 \le x \le 1} \rho_k(x) . $$

This subsection is devoted to the proof of the following proposition.

Proposition 5.6. Let $\mu$ be a probability measure on $\mathbb{R}$ such that

$$ \left| \int P_k\, d\mu \right| \le \varepsilon_k , \quad 1 \le k \le 2m - 2 . $$



(32) 
Then 

d K {^ a) < 2b m _ l + (1 + m% 2 m _ x B^ 



in I 



2m- 2 



\ k=l 



This proposition is a "stability version" of the Chebyshev-Markov-Stieltjes inequalities (that correspond to $\varepsilon_1 = \varepsilon_2 = \cdots = \varepsilon_{2m-2} = 0$). We learned some of the ideas in the proof from the work of Nevai [23].

Several well-known statements are stated further without proof; these statements are marked with an asterisk. The reader may find the proofs in the books of Akhiezer [Ch. III] or Szegő [25, Ch. II].

Remark 5.7. For every measure $\sigma$ that we encounter in this note (or, more formally, for probability measures in the class considered in the previous subsection),

\[
b_m \le C/m \quad \text{and} \quad B_m \le Cm .
\]

Therefore for these measures (32) implies

\[
d_K(\sigma, \mu) \le C \left( \frac{1}{m} + m^6 \sqrt{\sum_{k=1}^{2m-2} \varepsilon_k^2} \right) .
\]
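To illustrate the orders of magnitude in Remark 5.7, here is a small Python sketch for the Wigner case, where $P_k = U_k$. It evaluates $B_m$ and $b_m$ on a finite grid, so the returned $b_m$ is only a grid approximation (from below) of the true maximum; the function names are illustrative.

```python
import numpy as np


def cheb_u_table(kmax, x):
    """Rows U_0(x), ..., U_kmax(x), computed by the three-term recurrence."""
    x = np.asarray(x, dtype=float)
    rows = [np.ones_like(x), 2.0 * x]
    for _ in range(2, kmax + 1):
        rows.append(2.0 * x * rows[-1] - rows[-2])
    return np.array(rows[: kmax + 1])


def wigner_B_and_b(m, grid=20001):
    """Grid approximations of B_m = max |U_m| and b_m = max 1 / sum_{j<=m} U_j^2."""
    x = np.linspace(-1.0, 1.0, grid)
    U = cheb_u_table(m, x)
    B_m = float(np.max(np.abs(U[m])))
    rho_m = 1.0 / np.sum(U**2, axis=0)
    b_m = float(np.max(rho_m))
    return B_m, b_m
```

In this case $B_m = m + 1$ (attained at $x = \pm 1$), while $m\,b_m$ stays bounded, in line with $b_m \le C/m$ and $B_m \le Cm$.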



Remark 5.8. Taking a = cr^ IK and scaling, we recover Lemma [3.21 
Proof of Proposition 5.6. Let

\[
\kappa_{1,m} < \kappa_{2,m} < \cdots < \kappa_{m,m}
\]

be the zeros of $P_m$. Choose $1 \le s \le m$ and construct two polynomials, $R$ and $S$, both of degree at most $2m-2$ and such that

\[
\begin{aligned}
& R(\kappa_{1,m}) = \cdots = R(\kappa_{s,m}) = 1, \\
& R(\kappa_{s+1,m}) = \cdots = R(\kappa_{m,m}) = 0, \\
& R'(\kappa_{1,m}) = \cdots = R'(\kappa_{s-1,m}) = R'(\kappa_{s+1,m}) = \cdots = R'(\kappa_{m,m}) = 0,
\end{aligned}
\tag{33}
\]

and

\[
\begin{aligned}
& S(\kappa_{1,m}) = \cdots = S(\kappa_{s-1,m}) = 1, \\
& S(\kappa_{s,m}) = \cdots = S(\kappa_{m,m}) = 0, \\
& S'(\kappa_{1,m}) = \cdots = S'(\kappa_{s-1,m}) = S'(\kappa_{s+1,m}) = \cdots = S'(\kappa_{m,m}) = 0 .
\end{aligned}
\tag{34}
\]



Lemma* (Markov-Stieltjes). The inequalities

\[
R \ge 1_{(-\infty, \kappa_{s,m}]} \ge 1_{(-\infty, \kappa_{s,m})} \ge S
\]

hold.
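The two interpolation problems and the Markov-Stieltjes inequalities can be explored numerically. The sketch below takes the Wigner case, where the nodes are the zeros $\cos(j\pi/(m+1))$ of $U_m$, and solves for $R$ and $S$ in the monomial basis; this is reasonable only for small $m$, since the linear system becomes ill-conditioned. The function names are illustrative, and $s$ is 1-based as in the text.

```python
import numpy as np
from numpy.polynomial import polynomial as npoly


def hermite_type_poly(nodes, values, free_index):
    """Coefficients (lowest degree first) of the polynomial q of degree <= 2m - 2
    with q(nodes[i]) = values[i] for all i and q'(nodes[i]) = 0 for i != free_index."""
    m = len(nodes)
    deg = 2 * m - 2
    value_rows = npoly.polyvander(nodes, deg)  # rows (1, x, ..., x^deg)
    deriv_rows = np.zeros_like(value_rows)
    deriv_rows[:, 1:] = np.arange(1, deg + 1) * value_rows[:, :-1]  # j * x^(j-1)
    keep = [i for i in range(m) if i != free_index]
    system = np.vstack([value_rows, deriv_rows[keep]])
    rhs = np.concatenate([values, np.zeros(m - 1)])
    return np.linalg.solve(system, rhs)


def markov_stieltjes_pair(m, s):
    """Zeros of U_m in increasing order, and coefficients of R and S for a given s."""
    kappa = np.sort(np.cos(np.arange(1, m + 1) * np.pi / (m + 1)))
    index = np.arange(1, m + 1)
    R = hermite_type_poly(kappa, (index <= s).astype(float), s - 1)
    S = hermite_type_poly(kappa, (index < s).astype(float), s - 1)
    return kappa, R, S
```

Evaluating both polynomials with `npoly.polyval` on a grid and comparing them with the two indicator functions gives a quick visual or numerical check of the lemma for a chosen pair $(m, s)$.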



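By the lemma, $\mu(-\infty, \kappa_{s,m}] \le \int R\, d\mu$. Expanding $R = \sum_{k=0}^{2m-2} a_k P_k$ (where $a_k = \int R P_k\, d\sigma$),

\[
\int R\, d\mu = \sum_{k=0}^{2m-2} a_k \int P_k\, d\mu
\le a_0 + \sum_{k=1}^{2m-2} |a_k|\, \varepsilon_k
\le a_0 + \sqrt{\sum_{k=1}^{2m-2} a_k^2}\; \sqrt{\sum_{k=1}^{2m-2} \varepsilon_k^2} .
\tag{35}
\]

Now,

\[
\sqrt{\sum_{k=1}^{2m-2} a_k^2} \le \sqrt{\sum_{k=0}^{2m-2} a_k^2} = \sqrt{\int R^2\, d\sigma}
\le \sqrt{\int 1_{(-\infty, \kappa_{s,m}]}\, d\sigma} + \sqrt{\int (R - S)^2\, d\sigma},
\tag{36}
\]

since definitely $R \le 1_{(-\infty, \kappa_{s,m}]} + (R - S)$.

By (33)-(34), $R - S$ is a square of some polynomial $p$ of degree $m - 1$;

\[
p(\kappa_{t,m}) = \delta_{st}, \qquad 1 \le t \le m .
\]

Therefore $p = \ell_{s,m}$ is the $s$-th Lagrange interpolation polynomial of order $m$.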




Lemma 5.9. For $-1 \le x \le 1$,

\[
|\ell_{s,m}(x)| \le m^2\, b_{m-1} B_m^2 .
\]

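Proof of Lemma 5.9. We start from an expression for $\ell_{s,m}$ that the reader may find in Szegő [25, Chapter XIV]:

\[
\ell_{s,m}(x) = \frac{\gamma_{m-1}}{\gamma_m}\, \rho_{m-1}(\kappa_{s,m})\, P_{m-1}(\kappa_{s,m})\, \frac{P_m(x)}{x - \kappa_{s,m}} .
\]

Let us estimate the terms one by one. First, since $|x| \le 1$ on the support of $\sigma$,

\[
\frac{\gamma_{m-1}}{\gamma_m} = \int x P_{m-1}(x) P_m(x)\, d\sigma(x)
\le \sqrt{\int P_{m-1}^2\, d\sigma}\, \sqrt{\int P_m^2\, d\sigma} = 1 .
\]

Then, $\rho_{m-1}(\kappa_{s,m}) \le b_{m-1}$, $|P_{m-1}(\kappa_{s,m})| \le B_{m-1}$. By the Lagrange mean-value theorem and A. A. Markov's inequality (see for example Todd [26]),

\[
\left| \frac{P_m(x)}{x - \kappa_{s,m}} \right| \le \max_{-1 \le y \le 1} |P_m'(y)| \le m^2 \max_{-1 \le y \le 1} |P_m(y)| = m^2 B_m .
\]

The lemma is proved. □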
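The expression for $\ell_{s,m}$ used in the proof can be compared with the usual product formula for Lagrange polynomials. The sketch below does this in the Wigner case, assuming $P_k = U_k$ (so that $\gamma_k = 2^k$ and $\gamma_{m-1}/\gamma_m = 1/2$) and $\rho_{m-1} = 1/\sum_{j<m} U_j^2$. The function names are illustrative, and $s$ is 0-based in the code.

```python
import numpy as np


def cheb_u_table(kmax, x):
    """Rows U_0(x), ..., U_kmax(x), computed by the three-term recurrence."""
    x = np.asarray(x, dtype=float)
    rows = [np.ones_like(x), 2.0 * x]
    for _ in range(2, kmax + 1):
        rows.append(2.0 * x * rows[-1] - rows[-2])
    return np.array(rows[: kmax + 1])


def zeros_of_u(m):
    """Zeros of U_m in increasing order."""
    return np.sort(np.cos(np.arange(1, m + 1) * np.pi / (m + 1)))


def lagrange_direct(kappa, s, x):
    """Product formula for the s-th Lagrange polynomial on the nodes kappa."""
    ell = np.ones_like(x)
    for t, node in enumerate(kappa):
        if t != s:
            ell = ell * (x - node) / (kappa[s] - node)
    return ell


def lagrange_via_christoffel(m, s, x):
    """(gamma_{m-1}/gamma_m) * rho_{m-1}(k_s) * P_{m-1}(k_s) * P_m(x) / (x - k_s)."""
    kappa = zeros_of_u(m)
    lower = cheb_u_table(m - 1, kappa[s])  # U_0, ..., U_{m-1} at the node
    rho = 1.0 / np.sum(lower**2)           # Christoffel function at the node
    ratio = 0.5                            # leading coefficient of U_k is 2^k
    p_m = cheb_u_table(m, x)[m]
    return ratio * rho * lower[m - 1] * p_m / (x - kappa[s])
```

The evaluation points `x` must avoid the node $\kappa_{s,m}$ itself, where the second expression has a removable singularity.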

Now recall the Gauss-Jacobi quadrature formula. 

Lemma* (Gauss-Jacobi quadrature). For any polynomial $q$ of degree not greater than $2m - 1$,

\[
\int q\, d\sigma = \sum_{i=1}^{m} \rho_{m-1}(\kappa_{i,m})\, q(\kappa_{i,m}) .
\]
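A quick numerical illustration of the quadrature formula in the Wigner case: the nodes are the zeros of $U_m$, the weights are $\rho_{m-1}(\kappa_{i,m}) = 1/\sum_{j<m} U_j(\kappa_{i,m})^2$, and the moments of $\frac{2}{\pi}\sqrt{1-x^2}\,dx$ are Catalan numbers divided by powers of 4. The function names below are illustrative.

```python
import numpy as np
from math import comb


def wigner_gauss_rule(m):
    """Nodes (zeros of U_m) and weights rho_{m-1}(node) for the Wigner measure."""
    theta = np.arange(1, m + 1) * np.pi / (m + 1)
    kappa = np.cos(theta)
    # U_j(cos t) = sin((j + 1) t) / sin t for j = 0, ..., m - 1
    U = np.array([np.sin((j + 1) * theta) / np.sin(theta) for j in range(m)])
    weights = 1.0 / np.sum(U**2, axis=0)
    return kappa, weights


def wigner_moment(p):
    """p-th moment of (2/pi) sqrt(1 - x^2) dx on [-1, 1]."""
    if p % 2 == 1:
        return 0.0
    j = p // 2
    return comb(2 * j, j) / (j + 1) / 4**j  # Catalan(j) / 4^j


def quadrature_moment(m, p):
    """Quadrature approximation of the p-th moment with m nodes."""
    kappa, weights = wigner_gauss_rule(m)
    return float(np.sum(weights * kappa**p))
```

For $p \le 2m - 1$ the values `quadrature_moment(m, p)` and `wigner_moment(p)` should coincide up to rounding, and the weights should match the closed form $\frac{2}{m+1}\sin^2\frac{i\pi}{m+1}$.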



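Applying (35)-(36), Lemma 5.9 and the Gauss-Jacobi quadrature, we obtain:

\[
\mu(-\infty, \kappa_{s,m}] \le \int R\, d\sigma + \left(1 + m^4 b_{m-1}^2 B_m^4\right) \sqrt{\sum_{k=1}^{2m-2} \varepsilon_k^2}
= \sum_{i=1}^{s} \rho_{m-1}(\kappa_{i,m}) + \left(1 + m^4 b_{m-1}^2 B_m^4\right) \sqrt{\sum_{k=1}^{2m-2} \varepsilon_k^2} .
\]

Similarly,

\[
\mu(-\infty, \kappa_{s,m}) \ge \sum_{i=1}^{s-1} \rho_{m-1}(\kappa_{i,m}) - \left(1 + m^4 b_{m-1}^2 B_m^4\right) \sqrt{\sum_{k=1}^{2m-2} \varepsilon_k^2} .
\]

The measure $\sigma$ satisfies the assumption (32) with $\varepsilon_j = 0$; therefore

\[
\sum_{i=1}^{s-1} \rho_{m-1}(\kappa_{i,m}) \le \sigma(-\infty, \kappa_{s,m}) \le \sigma(-\infty, \kappa_{s,m}] \le \sum_{i=1}^{s} \rho_{m-1}(\kappa_{i,m}) .
\]

The claim of the proposition follows. □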




6. Counting non-backtracking paths 

This section follows [4] (where walks on the complete bipartite graph were considered, cf. Subsection 6.3); we have corrected minor errors and misprints.

6.1. Fragments. Let $G = (V, E)$ be a graph, and let

\[
\mathfrak{w} = (u_*, \cdots) \in W_{\mathrm{even}}(2k, G) .
\]

Consider $\mathfrak{w}$ as a set of triples $\{(u, v, r) \mid 1 \le r \le 2k\}$, meaning that the $r$-th edge of $\mathfrak{w}$ goes from $u \in V$ to $v \in V$.

Divide the edges into 3 classes. If $e \in \mathfrak{w}$ is the first edge to visit a vertex $v \in V$, we will write $e \in T_1$. More formally,

\[
T_1 = \{ (u, v, r) \in \mathfrak{w} \mid \forall r' < r,\ (u', v', r') \in \mathfrak{w} \Rightarrow v \notin \{u', v'\} \} .
\]

The path $\mathfrak{w}$ is even, therefore for every $e \in \mathfrak{w}$ there will be another edge in $\mathfrak{w}$, coincident with $e$. Denote

\[
T_2 = \{ (u, v, r) \in \mathfrak{w} \mid \exists!\, r' < r,\ (u, v, r') \in T_1 \vee (v, u, r') \in T_1 \} .
\]

Finally, let $T_3 = \mathfrak{w} \setminus (T_1 \cup T_2)$.
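The three classes can be computed mechanically from the vertex sequence of a path. The following Python sketch follows the formal definitions of $T_1$, $T_2$, $T_3$ as read here (an edge is in $T_1$ if its endpoint $v$ is not an endpoint of any earlier edge, and in $T_2$ if exactly one earlier $T_1$ edge joins the same two vertices). The function name `classify_edges` and the list-of-vertices input format are assumptions made for the illustration.

```python
def classify_edges(vertices):
    """Split the edges of a path, given by its vertex sequence, into T1, T2, T3.

    Edges are triples (u, v, r): the r-th edge goes from u to v, r = 1, 2, ...
    """
    edges = [(vertices[r - 1], vertices[r], r) for r in range(1, len(vertices))]
    visited = set()  # endpoints of all edges strictly before the current one
    t1, t2, t3 = [], [], []
    for u, v, r in edges:
        if v not in visited:
            # no earlier edge touches v: this edge is the first to visit v
            t1.append((u, v, r))
        else:
            # earlier T1 edges joining the same pair of vertices, in either direction
            matches = [e for e in t1 if (e[0], e[1]) in {(u, v), (v, u)}]
            (t2 if len(matches) == 1 else t3).append((u, v, r))
        visited.update((u, v))
    return t1, t2, t3
```

For a triangle traversed twice, `classify_edges([0, 1, 2, 0, 1, 2, 0])` puts the two edges that discover new vertices into $T_1$, their repetitions into $T_2$, and both traversals of the closing edge into $T_3$.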

A sequence of vertices $f = (u_1, \cdots, u_\ell)$ ($\ell > 1$) is called a proto-fragment of $\mathfrak{w}$ if the following 3 conditions hold:

(i) for some $r$,

\[
(u_1, u_2, r),\ (u_2, u_3, r+1),\ \cdots,\ (u_{\ell-1}, u_\ell, r+\ell-2) \in T_1 ;
\]

(ii) for some $r'$ ($> r$),

either $(u_1, u_2, r'),\ (u_2, u_3, r'+1),\ \cdots,\ (u_{\ell-1}, u_\ell, r'+\ell-2) \in T_2$
or $(u_\ell, u_{\ell-1}, r'),\ \cdots,\ (u_3, u_2, r'+\ell-3),\ (u_2, u_1, r'+\ell-2) \in T_2$;

(iii) $f$ is maximal with respect to (i)-(ii).

If $f$ is a proto-fragment and $u_1 \ne u_*$, we call its suffix $(u_2, \cdots, u_\ell)$ a fragment of length $\ell - 1$. If $u_1 = u_*$, we call $f$ itself a fragment of length $\ell$. The vertices on $\mathfrak{w}$ are thereby divided into $F$ fragments.

Lemma 6.1. $F \le 2\#T_3 + 1$.

This inequality holds for any graph G, as one can easily verify. 

6.2. The complete graph. 

Proposition 6.2. There exist two constants $C, c > 0$ such that, for $k \le c\,n^{1/10}$,

\[
\# W_{\mathrm{even}}(2k, K_n) \le C k\, n^k .
\]

The following lemma is obvious:

Lemma 6.3. The number of different fragments of length $\ell$ in $K_n$ is not greater than $n^\ell$.

Proof of Proposition 6.2. First choose the number $S$ of distinct vertices on $\mathfrak{w}$. Then choose the lengths of the fragments: this can be done in $\binom{S}{F} \le S^F / F!$ ways. Next, choose the fragments themselves; by Lemma 6.3 this can be done in $\le n^S$ ways.

There are $2^F$ possibilities to orient the fragments in $T_2$. Now glue the oriented fragments onto the path; this can be done in $(2k - 2S + 1)^{2F}$ ways.

Every one of the remaining $2k - 2S$ vertices coincides with one of the $S$ vertices on the fragments. Therefore there are $\le S^{2k-2S}$ possibilities to arrange these vertices.

Therefore

\[
\# W_{\mathrm{even}}(2k, K_n) \le \sum_{S,F} \frac{S^F}{F!}\, n^S\, 2^F (2k - 2S + 1)^{2F} S^{2k-2S}
\]



\[
\le n^k \sum_{S,F} \left( \frac{C_1 S (k-S)^2}{F} \right)^{F} \left( \frac{S^2}{n} \right)^{k-S} .
\]



Now, $F \le 2\#T_3 + 1 = 4k - 4S + 5$; the function $x \mapsto (y/x)^x$ is increasing on $[0, y/e]$; therefore

#W cven (2k, K n ) <n k J2 ( C i S ( k ~ S)f k ~ S) (—^ 

Q TP \ ^ 



< n k 



S,F 



S,F 

for $k \le c\,n^{1/10}$.



□ 



6.3. The complete bipartite graph. 



Proposition 6.4. There exist two constants $C, c > 0$ such that, for
k < ce /20 n l l w , 

\[
\# W_{\mathrm{even}}(2k, K_{n,N}) \le C k\, (nN)^{k/2} .
\]

The following obvious lemma replaces Lemma 6.3.

Lemma 6.5. The number of different fragments of length $\ell$ in $K_{n,N}$ is not greater than

\[
2\sqrt{N/n}\,(nN)^{\ell/2} .
\]

Proof of Proposition 6.4. Similarly to the proof of Proposition 6.2,

#2^(2*,^,*) 
S F 



< J2 ^(2^Njn) F (nN) s l 2 2 F (2k -2S + l) 2F S 

S,F 

<{nN)*r( C W-*y(*) 
C x S{k- S)^ 4(k - S] ' ? 2 



2k~2S 



< n 



N )k/2J2 



S,F 



( s 2 v 



< 



4 \ \ 4(fc-S) 



< C 2 k{nN) k/2 



if Jfe < cf^n 1 / 10 . 



□